Análise Automática de Coerência Usando o Modelo Grade de Entidades para o Português (Automatic Coherence Analysis Using the Entity-grid Model for Portuguese) [in Portuguese]

Authors

  • Alison Rafael Polpeta Freitas
  • Valéria Delisandra Feltrim
Abstract

In this paper we investigate the applicability of Barzilay and Lapata’s (2008) entity-grid model to the evaluation of coherence in scientific abstracts written in Portuguese. More specifically, we assess whether such a model could be used to implement a classifier capable of detecting linearity breaks that affect coherence. Our experimental results are close to those of the original entity-grid model for English and very similar to the results reported by related work for other languages. They are also close to those obtained by human judges, showing that the entity-grid model can be applied in the investigated context.

Similar resources

Análise Automática de Coerência Textual em Resumos Científicos: Avaliando Quebras de Linearidade (Automatic Analysis of Textual Coherence in Scientific Abstracts: Evaluating Linearity Breaks)

This paper presents an extension of the coherence analysis module that is part of the writing tool called SciPo, allowing it to automate the analysis of the coherence dimension called Linearity Break. The proposed implementation is based on a combination of the entity grid model and information from the rhetorical structure of scientific abstracts, allowing it to generate messages that indicate...

Uma abordagem de classificação automática para Tipo de Pergunta e Tipo de Resposta (An Automatic Approach for Classification of Question Type and Answer Type) [in Portuguese]

Question type classification and answer type classification are very important tasks for question answering systems. This paper presents an automatic machine learning approach to these tasks. We used decision trees as the machine learning algorithm and 14 features developed using a tagger and a named entity system...

PorTAl: Recursos e Ferramentas de Tradução Automática para o Português do Brasil (PorTAl: Resources and Tools for Machine Translation of Brazilian Portuguese) [in Portuguese]

This paper describes the machine translation (MT) site PorTAl, developed to integrate useful tools and resources for MT and multilingual processing. Currently under development, PorTAl will initially provide tools and resources for Brazilian Portuguese, English and Spanish. In the near future we believe that PorTAl will stimulate progress in multilingual applications, part...

Processo de construção de um corpus anotado com Entidades Geológicas visando REN (Building an annotated corpus with geological entities for NER) [in Portuguese]

This article presents the building process of GeoCorpus, developed for the Geology domain, more specifically for the Bacia Sedimentar Brasileira subarea. The annotation is focused on Geological Entities in Portuguese text, and aims at Named Entity Recognition in the proposed domain. A case study validated both the annotation process and a tool which supported the specialists in the identificati...

RePort - Um Sistema de Extração de Informações Aberta para Língua Portuguesa (Report - An Open Information Extraction System for Portuguese Language)

An emerging field of research in Natural Language Processing (NLP) proposes Open Information Extraction (Open IE) systems. Open IE systems follow a domain-independent extraction paradigm that uses generic patterns to extract all relationships between entities. In this work, we present RePort, an Open IE method for Portuguese based on ReVerb, an approach for English. Adaptations of syntactic and...

Publication date: 2013